CDPEAC Quick-Reference 


CDPEAC: CM-5 Vector Unit Programming in C 

This document describes the CDPEAC instruction set, used for writing 
C programs that access the CM-5’s Vector Unit (VU) accelerators. 

Note: This is a preliminary version of a forthcoming document on CDPEAC. 
Please send any comments and/or corrections to: traveler@think.com

Syntax Conventions Used In This Document: 


    {a,b...}    Selection; you must choose a or b or...
    [x]         Optional part; you may include x
    bold        Indicates opcode or suffix that can be added to opcode
    register    (Also used to indicate register names.)
    name        Metavariable; replaced by a value or symbol
                (typically indicated by a list of valid replacements)


1 CDPEAC Syntax 

A CDPEAC program consists of C code with embedded CDPEAC statements. 
These statements are expanded during compilation into code that controls the 
CM-5’s Vector Units. 

A CDPEAC statement is one of the following: 

■ a VU Instruction

■ a VU Accessor Instruction

■ a VU Special Instruction

A VU Instruction corresponds to a scalar or vector operation performed by the
Vector Units, and is either:

■ a VU Arithmetic operator, which performs an ALU operation:

    addv(i, V0, V1, V2) /* vector add (V2=V0+V1) */

■ a VU Memory operator, which performs a memory load or store:

    loadv(i, address, V0) /* load values into V0 */

■ a VU Statement Modifier, which affects statement compilation:

    vmmode(cond) /* Vector mask, conditionalization */

■ or some combination of the above types, made with the join operator:

    join3(addv(i, V0, V1, V2), loadv(i, address, V0), vmmode(cond))

A VU Accessor Instruction is an instruction that executes on the CM-5 node
microprocessor (the SPARC), but modifies the contents of VU registers or
parallel memory:

    dpwrt(i, ALL_DPS, sp_src, R0) /* Write VU data register */
    dpget(i, DP_1, dp_stride_memory) /* Get memory stride */

A VU Special Instruction is an instruction not in either of the above two classes,
which performs some useful operation on the SPARC and/or VUs.

    set_vector_length(8) /* Set default vector length */

    ldvm(R0) /* Set contents of dp_vector_mask register */


1.1 The Join Operator 

The join operator connects arithmetic operations, memory operations, and 
statement modifiers to form compound CDPEAC statements: 

    join(instruction1, instruction2)          -- default join, same as join2
    joinN(instruction1, ..., instructionN)    -- N-way join
    N = {1,2,3,4,5,6,7,8,9}

A join can have at most one arithmetic and one memory operation, but any
number of modifiers from 0 to 7. The N of a joinN must match the total number
of instructions (operations and modifiers) supplied to the joinN.
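
The counting rules for a join can be checked mechanically. The following is a minimal sketch in plain C, not part of CDPEAC itself: the part_kind enum and the join_is_valid function are hypothetical names introduced only to illustrate the rules stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the pieces of a CDPEAC statement. */
typedef enum { PART_ARITH, PART_MEM, PART_MODIFIER } part_kind;

/* Returns 1 if `parts` would form a legal joinN under the rules above, else 0. */
int join_is_valid(int n, const part_kind *parts, size_t count)
{
    int arith = 0, mem = 0, mods = 0;
    size_t k;

    if (n < 1 || n > 9 || (size_t)n != count)
        return 0;               /* N must equal the number of instructions */
    for (k = 0; k < count; k++) {
        switch (parts[k]) {
        case PART_ARITH:    arith++; break;
        case PART_MEM:      mem++;   break;
        case PART_MODIFIER: mods++;  break;
        default:            return 0;
        }
    }
    /* At most one arithmetic op, one memory op, and 0 to 7 modifiers. */
    return arith <= 1 && mem <= 1 && mods <= 7;
}
```

Under this reading, the largest legal statement is a join9 made of one arithmetic operation, one memory operation, and seven modifiers.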


Version 1.0, February 1993 
Copyright © 1993 Thinking Machines Corporation





1.2 Registers 

VU Data Registers: CDPEAC code generally refers to VU data registers. The 
128 VU data registers are referenced by the following symbolic names: 

    R0 - R127          All 128 registers in sequential order
    V0 - V15           Vector Regs (first in each vector, same as R0, R8 ... R120)
    S0 - S15           Scalar Regs (single precision), same as R0 - R15
    S0 - S30 (even)    Scalar Regs (double precision), same as R0 - R30 (even)

Vector Registers: The VU data registers are grouped in banks of 8, called vector
registers. The special register names V0 - V15 are used to refer to the first data
register in each vector. When a vector instruction requires an “aligned vector”
operand, the operand must be one of the Vnn registers (or the equivalent Rnn).

Scalar Registers: Scalar VU operations only accept the scalar registers. These
are S0 - S15 (single word), or the even registers from S0 - S30 (double word).
Scalar operations restrict their operands to the Snn (or equivalent Rnn) registers.
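
The relationship between the register names can be written out as arithmetic on register numbers. The sketch below is plain C for illustration only; the function and macro names are hypothetical and are not CDPEAC symbols.

```c
#include <assert.h>

#define NUM_DATA_REGS   128   /* R0 - R127 */
#define REGS_PER_VECTOR 8     /* data registers in each vector register */

/* R-register number named by Vn (V0 = R0, V1 = R8, ..., V15 = R120),
   or -1 if n is out of range. */
int vreg_to_rreg(int n)
{
    return (n >= 0 && n < NUM_DATA_REGS / REGS_PER_VECTOR)
               ? n * REGS_PER_VECTOR : -1;
}

/* 1 if Rr is the first register of a bank, i.e. usable as an
   "aligned vector" operand. */
int is_aligned_vector_reg(int r)
{
    return r >= 0 && r < NUM_DATA_REGS && r % REGS_PER_VECTOR == 0;
}

/* 1 if Rr is a legal scalar register: R0-R15 for single words,
   even registers R0-R30 for double words. */
int is_scalar_reg(int r, int is_double)
{
    if (is_double)
        return r >= 0 && r <= 30 && r % 2 == 0;
    return r >= 0 && r <= 15;
}
```

For example, vreg_to_rreg(15) gives 120, matching the "R0, R8 ... R120" sequence in the table above.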

Register Restrictions: The R0 and R1 registers are used to store immediate
operands, so these registers should be used carefully.

Register Offsets: You can use an offset to a data register to access it and those
succeeding it in Rnn order as a vector (usually to access Vnn elements). (See the
dreg_x register modifier in Section 1.3 below.)

Internal Registers: There are some VU internal registers that influence the 
execution of DPEAC instructions. Some important examples are: 

    dp_stride_rs1            Stride of src1 operand in arithmetic instruction.
    dp_stride_memory         Stride of memory addresses in memory instruction.
    dp_vector_mask           Context mask for vectored arithmetic operations.
    dp_vector_mask_mode      Default vector conditionalization (masking) mode.
    dp_vector_length         Default vector length for both types of instructions.
    dp_vector_mask_buffer    Copy of dp_vector_mask used to save/restore it.

Important: The pair of VUs on a single chip (that is, VUs 0/1 and 2/3) actually
share all these internal registers except for the two registers dp_vector_mask
and dp_vector_mask_buffer. This means that any change to a shared register
affects both VUs that share it.


1.3 Register Modifiers

These modifiers can be applied to any register argument in a CDPEAC operation 
to specify an offset, stride, or indirection for the register. 

Register offsets: 

    dreg_x(dreg, index)     Register offset (index must be a constant).
                            If dreg is Rnn, this refers to R(nn+index).

Note: The dreg_x form can be the dreg argument in any modifier below.

Register striding: (Note: Unit stride is 1 for singles, 2 for doubles)

    dreg                    With no modifier, use unit striding
    dreg_u(dreg, stride)    Use given stride once
    scalar(dreg)            Scalar striding, same as dreg_u(dreg, 0)
    SCALAR(dreg)            Alternate name for scalar(dreg)

Src1 register striding: (Note: Default src1 stride is dp_stride_rs1)

    dreg_u(dreg, mode)      Use default stride (mode is a literal symbol)
    dreg_s(dreg, stride)    Store stride as the src1 default and use it
    dreg_u_s(dreg, stride, set_stride)
                            Use stride, and store set_stride as default

Register indirection:

    dreg_i(dreg, ireg)      Simple register indirection
    dreg_i(dreg, dreg_u(ireg, stride))
                            Register indirection, ireg striding


1.4 Common Abbreviations


Common CDPEAC opcode suffixes:

    Type:   Meaning:
    s       Scalar operation: single elemental operation on given arguments
    v       Vector operation: multiple elemental operation with striding
    _i      Memory stride indirection (for memory operations)
    i       Immediate value in src2 argument (for arithmetic operations)

    _v      Use explicit vector length    (unsticky, vlen = constant or register)
    _vs     Use and set vector length     (sticky, vlen = constant or register)
    _vh     Vlen from register field      (unsticky, 1+(bits 19:22 of reg))
    _vhs    Vlen from register field      (sticky, 1+(bits 19:22 of reg))
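
The "1+(bits 19:22 of reg)" rule for the _vh and _vhs suffixes amounts to a four-bit field extraction. The sketch below is plain C for illustration; the function name is hypothetical, and it assumes that bit 0 is the least-significant bit of a 32-bit register value, which this reference does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Vector length selected by a _vh/_vhs suffix: 1 + (bits 19:22 of reg).
   Assumption: bit 0 is the least-significant bit of the register. */
unsigned vlen_from_reg_field(uint32_t reg)
{
    return 1u + ((reg >> 19) & 0xFu);
}
```

With a four-bit field, this yields vector lengths from 1 through 16.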

CDPEAC Operand type symbols:

    Type:   Meaning:
    u       Unsigned single-precision (32 bit) integer
    du      Unsigned double-precision (64 bit) integer
    i       Signed single-precision (32 bit) integer
    di      Signed double-precision (64 bit) integer
    f       Single-precision (32 bit) float
    df      Double-precision (64 bit) float


1.5 Typical CDPEAC Operand Names 

    address        VU memory address
    type           CDPEAC operation type
    src, src<n>    source VU data registers (or immediate values)
    dest           destination VU data register
    sp_src         SPARC source register
    sp_dest        SPARC destination register
    dreg           VU data register
    ireg           data register being used for indirection
    creg           VU control register


2 CDPEAC Arithmetic Instructions

2.1 Monadic (One Source) Operators 

These operators perform an arithmetic operation on the single src argument, and
store the result in the dest argument.

Formats:

    opcode{s,v}[i](type, src, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src, dest)
    type = {u, du, i, di, f, df}

    Opcodes:   Types:                   Purpose:
    move       {u, du, i, di, f, df}    Move src to dest, no status generated
    test       {u, du, i, di, f, df}    Move src to dest and test
    not        {u, du}                  Bitwise invert (dest = ~src)
    clas       {f, df}                  Classify operand (dest = class of src)
    exp        {f, df}                  Extract exponent from float
    mant       {f, df}                  Extract mantissa with hidden bit
    ffb        {u, du}                  Find first “1” bit
    neg        {i, di, f, df}           Negate (dest = 0 - src)
    abs        {i, di, f, df}           Absolute value (dest = |src|)
    inv        {f, df}                  Invert (dest = 1/src)
    sqrt       {f, df}                  Square root (dest = sqrt(src))
    isqt       {f, df}                  Inverse root (dest = 1/sqrt(src))


2.1.1 Convert Operator (Monadic with extra type argument) 

The to operator converts between data types (src is of type1, dest of type2).

Format:

    opcode{s,v}[i](type1, type2[r], src, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type1, type2[r], vlen, src, dest)
    type1, type2 = {u, du, i, di, f, df}

    Opcode:   Type1:            Type2:             Purpose:
    to        {u, du, i, di}    {f, df}            Convert integer to float
    to        {f, df}           {f, df}            Convert to another precision
    to        {f, df}           {u, du, i, di}r    Convert to integer (round)
    to        {f, df}           {u, du, i, di}     Convert to integer (truncate)


2.1.2 Dyadic (Two Source) Operators:

These operators perform an arithmetic operation on the src1 and src2 arguments,
and store the result in the dest argument.

Formats:

    opcode{s,v}[i](type, src1, src2, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src1, src2, dest)
    type = {u, du, i, di, f, df}

    Opcodes:   Types:                   Purpose:
    add        {u, du, i, di, f, df}    Add (dest = src1 + src2)
    addc       {u, du, i, di}           Integer add with carry
    sub        {u, du, i, di, f, df}    Subtract (dest = src1 - src2)
    subc       {u, du, i, di}           Integer subtract with carry
    subr       {u, du, i, di, f, df}    Subtract reversed (dest = src2 - src1)
    sbcc       {u, du, i, di}           Integer subtract reversed with carry
    mul        {u, du, i, di, f, df}    Multiplication (low 32/64 bits for ints)
    mulh       {du, di}                 Integer multiply (high 64 bits)
    div        {f, df}                  Divide (dest = src1 / src2)
    enc        {u, du}                  Make float from exp and mant (src1, src2)
    shl        {u, du}                  Shift left (dest = src1 << src2)
    shlr       {u, du}                  Shift left reversed (dest = src2 << src1)
    shr        {u, du, i, di}           Shift right (dest = src1 >> src2)
    shrr       {u, du, i, di}           Shift right reversed (dest = src2 >> src1)
    and        {u, du}                  Bitwise logical AND
    nand       {u, du}                  Bitwise logical NAND
    andc       {u, du}                  Bitwise logical AND, src1 complemented
    or         {u, du}                  Bitwise logical IOR
    nor        {u, du}                  Bitwise logical NOR
    xor        {u, du}                  Bitwise logical XOR
    mrg        {u, du, i, di, f, df}    If vector mask bit = 1 then src1 else src2


2.1.3 Arithmetic Comparisons:

These operators perform an arithmetic comparison between the src1 and src2
arguments, and set status flags accordingly.

Format:

    opcode{s,v}[i](type, src1, src2)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src1, src2)
    type = {u, du, i, di, f, df}

    Opcodes:   Types:                   Purpose:
    gt         {u, du, i, di, f, df}    Greater than
    ge         {u, du, i, di, f, df}    Greater than or equal
    lt         {u, du, i, di, f, df}    Less than
    le         {u, du, i, di, f, df}    Less than or equal
    eq         {u, du, i, di, f, df}    Equal
    ne         {u, du, i, di, f, df}    Not equal or unordered
    lg         {u, du, i, di, f, df}    Ordered and not equal
    un         {u, du, i, di, f, df}    Unordered


2.1.4 Compare (Dyadic with Rd constant) 

The Compare operation tests for a numeric relationship between the src1 and
src2 arguments, as indicated by the supplied constant code.

Format:

    opcode{s,v}[i](type, src1, src2, code)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src1, src2, code)
    type = {u, du, i, di, f, df}

    Opcode:   Types:                   Code:   Purpose:
    cmp       {u, du, i, di, f, df}    0       Test for greater than
    cmp       {u, du, i, di, f, df}    1       Test for equal
    cmp       {u, du, i, di, f, df}    2       Test for less than
    cmp       {u, du, i, di, f, df}    3       Test for greater than or equal
    cmp       {u, du, i, di, f, df}    4       Test for unordered (NaN present)
    cmp       {u, du, i, di, f, df}    5       Test for ordered and not equal
    cmp       {u, du, i, di, f, df}    6       Test for not equal or unordered
    cmp       {u, du, i, di, f, df}    7       Test for less than or equal
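
The eight cmp codes can be modelled on the host with ordinary C floating-point comparisons. The sketch below is illustrative only: the function name is hypothetical, and it assumes that each test reads as "src1 relation src2" and that a pair is unordered exactly when either operand is a NaN.

```c
#include <assert.h>
#include <math.h>

/* Evaluate a cmp code (0-7, per the table above) on two floating-point
   operands. Returns 1 if the tested relationship holds, 0 if it does not,
   and -1 for a code outside the table. */
int cmp_code_holds(int code, double src1, double src2)
{
    int unordered = isnan(src1) || isnan(src2);

    switch (code) {
    case 0:  return src1 >  src2;                 /* greater than */
    case 1:  return src1 == src2;                 /* equal */
    case 2:  return src1 <  src2;                 /* less than */
    case 3:  return src1 >= src2;                 /* greater than or equal */
    case 4:  return unordered;                    /* unordered (NaN present) */
    case 5:  return !unordered && src1 != src2;   /* ordered and not equal */
    case 6:  return src1 != src2;                 /* not equal or unordered */
    case 7:  return src1 <= src2;                 /* less than or equal */
    default: return -1;
    }
}
```

In C, every ordering comparison involving a NaN is false and != is true, which is why code 6 needs no explicit unordered check while code 5 does.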


2.1.5 Dyadic Mult-Op Operators

These operations perform a multiplication and an arithmetic (or logical) operation
on the src1, src2, and dest arguments, and store the result in dest.

Format:

    opcode{s,v}[i](type, src1, src2, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src1, src2, dest)
    type = {u, du, i, di, f, df}

Note: In the opcode descriptions below, the optional [h] indicates that the high 
64 bits of the multiplication are to be used in the logical operation, rather than 
the low 64 bits (the default). 


Accumulative Operators

    Opcodes:   Types:                   Purpose:
    mada       {u, du, i, di, f, df}    dest = (src1 * src2) + dest
    msba       {u, du, i, di, f, df}    dest = (src1 * src2) - dest
    msra       {u, du, i, di, f, df}    dest = dest - (src1 * src2)
    nmaa       {u, du, i, di, f, df}    dest = -dest - (src1 * src2)
    m[h]sa     {du}                     dest = (src1 * src2) AND dest
    m[h]ma     {du}                     dest = (src1 * src2) AND NOT dest
    m[h]oa     {du}                     dest = (src1 * src2) IOR dest
    m[h]xa     {du}                     dest = (src1 * src2) XOR dest


Inverted Operators

    Opcodes:   Types:                   Purpose:
    madi       {u, du, i, di, f, df}    dest = (src2 * dest) + src1
    msbi       {u, du, i, di, f, df}    dest = (src2 * dest) - src1
    msri       {u, du, i, di, f, df}    dest = src1 - (src2 * dest)
    nmai       {u, du, i, di, f, df}    dest = -src1 - (src2 * dest)
    m[h]si     {du}                     dest = (src2 * dest) AND src1
    m[h]mi     {du}                     dest = (src2 * dest) AND NOT src1
    m[h]oi     {du}                     dest = (src2 * dest) IOR src1
    m[h]xi     {du}                     dest = (src2 * dest) XOR src1
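
The difference between the accumulative and inverted forms is which operand the old value of dest replaces. The sketch below models one element of the four arithmetic forms of each kind in plain C on doubles. It is an illustration of the formulas in the tables above, not CDPEAC code, and it does not model the rounding behavior of the VU hardware. The dot_product and horner helpers are hypothetical examples of where each form fits naturally.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulative forms: the old dest is the addend. Each returns the new dest. */
double mada(double src1, double src2, double dest) { return (src1 * src2) + dest; }
double msba(double src1, double src2, double dest) { return (src1 * src2) - dest; }
double msra(double src1, double src2, double dest) { return dest - (src1 * src2); }
double nmaa(double src1, double src2, double dest) { return -dest - (src1 * src2); }

/* Inverted forms: the old dest is a multiplicand and src1 is the addend. */
double madi(double src1, double src2, double dest) { return (src2 * dest) + src1; }
double msbi(double src1, double src2, double dest) { return (src2 * dest) - src1; }
double msri(double src1, double src2, double dest) { return src1 - (src2 * dest); }
double nmai(double src1, double src2, double dest) { return -src1 - (src2 * dest); }

/* mada pattern: running sum of products. */
double dot_product(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    size_t k;
    for (k = 0; k < n; k++)
        acc = mada(a[k], b[k], acc);
    return acc;
}

/* madi pattern: Horner's rule, acc = acc * x + coeff, highest power first. */
double horner(const double *coeff, size_t n, double x)
{
    double acc = 0.0;
    size_t k;
    for (k = 0; k < n; k++)
        acc = madi(coeff[k], x, acc);
    return acc;
}
```

The accumulative form suits reductions such as dot products, while the inverted form suits recurrences in which the running value is rescaled at every step.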


2.1.6 Convert Operation (Dyadic with Rs2 constant)

These operations convert the src argument to the type indicated by the constant
code argument, and store the result in the dest argument.

Format:

    opcode{s,v}[i](type, src, code, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src, code, dest)
    type = {i[r], f, fi}
    code = a C constant from the list below

    Opcode/Type:   Code:                  Purpose:
    cvt  i[r]      CVTICD_F_I (4)         Single float to single signed integer
    cvt  i[r]      CVTICD_F_U (5)         Same, to unsigned integer
    cvt  i[r]      CVTICD_F_DI (6)        Single float to double signed integer
    cvt  i[r]      CVTICD_F_DU (7)        Same, to unsigned integer
    cvt  i[r]      CVTICD_DF_I (12)       Double float to single signed integer
    cvt  i[r]      CVTICD_DF_U (13)       Same, to unsigned integer
    cvt  i[r]      CVTICD_DF_DI (14)      Double float to double signed integer
    cvt  i[r]      CVTICD_DF_DU (15)      Same, to unsigned integer
    cvt  f         CVTFCD_F_DF (3)        Single float to double float
    cvt  f         CVTFCD_DF_F (9)        Double float to single float
    cvt  fi        CVTFICD_I_F (1)        Single signed integer to single float
    cvt  fi        CVTFICD_U_F (5)        Same, but from unsigned integer
    cvt  fi        CVTFICD_I_DF (3)       Single signed integer to double float
    cvt  fi        CVTFICD_U_DF (7)       Same, but from unsigned integer
    cvt  fi        CVTFICD_DI_F (9)       Double signed integer to single float
    cvt  fi        CVTFICD_DU_F (13)      Same, but from unsigned integer
    cvt  fi        CVTFICD_DI_DF (11)     Double signed integer to double float
    cvt  fi        CVTFICD_DU_DF (15)     Same, but from unsigned integer



2.1.7 True Triadic (Three Source) Operators 

These operations perform a multiplication and an arithmetic (or logical) operation
on the src1, src2, and src3 arguments, and store the result in dest.

Format:

    opcode{s,v}[i](type, src1, src2, src3, dest)
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, src1, src2, src3, dest)
    type = {u, du, i, di, f, df}

Note: In the opcode descriptions below, the optional [h] indicates that the high 
64 bits of the multiplication are to be used in the logical operation, rather than 
the low 64 bits (the default). 


    Opcodes:   Types:                   Purpose:
    madt       {u, du, i, di, f, df}    dest = (src1 * src2) + src3
    msbt       {u, du, i, di, f, df}    dest = (src1 * src2) - src3
    msrt       {u, du, i, di, f, df}    dest = src3 - (src1 * src2)
    nmat       {u, du, i, di, f, df}    dest = -src3 - (src1 * src2)
    m[h]st     {du}                     dest = (src1 * src2) AND src3
    m[h]mt     {du}                     dest = (src1 * src2) AND NOT src3
    m[h]ot     {du}                     dest = (src1 * src2) IOR src3
    m[h]xt     {du}                     dest = (src1 * src2) XOR src3

Important:

When a triadic operator is joined with a memory operator, the src2 argument
of the triadic must be identical to the dreg argument of the memory operator.
(This restriction is imposed by the way such statements are assembled.)
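
The optional [h] in the logical triadic opcodes selects which half of the multiplication result feeds the logical operation. The sketch below shows one way to model this in portable C for the du type, assuming the product in question is the full 128-bit unsigned product of two 64-bit operands. All function names here are hypothetical host-side helpers, not CDPEAC symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of the product a * b (unsigned arithmetic wraps modulo 2^64). */
uint64_t mul_lo64(uint64_t a, uint64_t b) { return a * b; }

/* High 64 bits of the 128-bit product a * b, built from 32-bit halves. */
uint64_t mul_hi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    /* Middle partial products plus the carry out of the low word;
       this sum cannot overflow 64 bits. */
    uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* mst / mhst: (src1 * src2) AND src3, on the low or high product half. */
uint64_t mst(uint64_t s1, uint64_t s2, uint64_t s3)  { return mul_lo64(s1, s2) & s3; }
uint64_t mhst(uint64_t s1, uint64_t s2, uint64_t s3) { return mul_hi64(s1, s2) & s3; }

/* mxt / mhxt: (src1 * src2) XOR src3, on the low or high product half. */
uint64_t mxt(uint64_t s1, uint64_t s2, uint64_t s3)  { return mul_lo64(s1, s2) ^ s3; }
uint64_t mhxt(uint64_t s1, uint64_t s2, uint64_t s3) { return mul_hi64(s1, s2) ^ s3; }
```

The same two helpers would serve the accumulative and inverted logical forms as well, since those differ only in which operands are multiplied.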


2.1.8 No-op Operator 

The untyped arithmetic no-op allows modifier side-effects without specifying 
an operation. The no-op takes no arguments. The suffixes are as described above. 

Format: 

    fnop{s,v}()
    fnop{s,v}_{v,vs,vh,vhs}()


3 CDPEAC Memory Operations


These operations move data between VU memory and data registers. 

Note: the default memory stride is stored in dp_stride_memory.

Formats:

    opcode{s,v}(type, address, dreg)
        -- use default memory stride
    opcode{s,v}_u(type, address, stride, dreg)
        -- use stride once
    opcode{s,v}_s(type, address, stride, dreg)
        -- use stride and store it as default
    opcode{s,v}_u_s(type, address, stride, set_stride, dreg)
        -- use stride, and store set_stride as default
    opcode{s,v}_i(type, address, ireg, dreg)
        -- memory stride indirection
    opcode{s,v}_i(type, address, dreg_u(ireg, stride), dreg)
        -- memory indirection with stride on ireg
    opcode{s,v}_{v,vs,vh,vhs}(type, vlen, address, dreg)
        -- explicit vector length for CDPEAC statement
    opcode{s,v}_{v,vs,vh,vhs}_i(type, vlen, address, ireg, dreg)
        -- vector length and memory stride indirection
    opcode{s,v}_{v,vs,vh,vhs}_u(type, vlen, address, ustride, dreg)
        -- vector length and use-once stride
    type = {u, du, i, di, f, df}

    Opcode:   Types:                   Purpose:
    load      {u, du, i, di, f, df}    Load from memory to VU data register
    store     {u, du, i, di, f, df}    Store from VU data register to memory
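
The _u, _s, and _u_s suffixes differ only in how they treat the stored default stride. The sketch below is a toy model of that bookkeeping in plain C. The vu_state struct and the function names are hypothetical, and the units of the stride are left abstract because this reference does not state them.

```c
#include <assert.h>

/* Toy model of the default memory stride register (dp_stride_memory). */
typedef struct { long default_stride; } vu_state;

/* No suffix: use the stored default stride. */
long stride_default(const vu_state *vu) { return vu->default_stride; }

/* _u: use the given stride once; the stored default is left unchanged. */
long stride_u(const vu_state *vu, long stride) { (void)vu; return stride; }

/* _s: use the given stride and store it as the new default. */
long stride_s(vu_state *vu, long stride)
{
    vu->default_stride = stride;
    return stride;
}

/* _u_s: use stride for this statement, but store set_stride as the default. */
long stride_u_s(vu_state *vu, long stride, long set_stride)
{
    vu->default_stride = set_stride;
    return stride;
}
```

Each function returns the stride that the current statement would use, so a sequence of calls shows which later statements inherit a changed default.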

No-Op Instruction: Untyped memory no-op allows modifier side-effects without
a load or store. Suffixes and arguments are as in the load/store formats above.

    memnop(address)
    memnop_u(address, ustride)
    memnop_s(address, stride)
    memnop_u_s(address, stride, set_stride)
    memnop_i(address, idreg)
    memnop_{v,vs,vh,vhs}(vlen, address)
    memnop_{v,vs,vh,vhs}_i(vlen, address, idreg)
    memnop_{v,vs,vh,vhs}_u(vlen, address, ustride)


4 CDPEAC Statement Modifiers


This section describes the statement modifiers that can be joined with arithmetic
and memory operations to affect their assembly and/or execution. Note: Some
of these modifiers (such as the last three) can be used on their own.

General Modifiers:

    nopad, pad(n)     Vector length padding (n = new length, default is 4)
    maddr(address)    Memory address for statement lacking memory load/store
    [no]align         Doubleword alignment guarantee on memory operand

Vector Mask Modifiers: 


    vmmode[_s](mode)    Vector mask conditionalization mode
                        (_s version sets value of dp_vector_mask_mode)

    Mode:      Meaning:
    vmmode     Use default vector mask mode (dp_vector_mask_mode)
    cond       Full conditionalization
    condalu    Arithmetic operation only
    condmem    Memory operation only
    always     No conditionalization

    vmrotate, vmcurrent     Vector mask bit rotation
    vminvert, vmtrue        Vector mask bit sense
    vmold, vmnew, vmnop     Vector mask copy mode
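
The four explicit conditionalization modes decide which half of a statement the vector mask gates. The sketch below is a rough host-side model in plain C built on several assumptions that this reference does not spell out: that bit k of the mask corresponds to element k, that a mask bit of 1 enables the element, and that a disabled arithmetic element leaves dest unchanged. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { VM_COND, VM_CONDALU, VM_CONDMEM, VM_ALWAYS } vm_mode;

/* 1 if the vector mask gates the arithmetic half of a statement. */
int mask_applies_to_alu(vm_mode m) { return m == VM_COND || m == VM_CONDALU; }

/* 1 if the vector mask gates the memory half of a statement. */
int mask_applies_to_mem(vm_mode m) { return m == VM_COND || m == VM_CONDMEM; }

/* Element-wise add under a mask (assumed: bit k gates element k, 1 = enabled,
   and disabled elements keep their old dest value). */
void masked_addv(vm_mode m, uint32_t mask, int vlen,
                 const int32_t *a, const int32_t *b, int32_t *dest)
{
    int k;

    assert(vlen >= 0 && vlen <= 32);
    for (k = 0; k < vlen; k++) {
        int active = !mask_applies_to_alu(m) || (int)((mask >> k) & 1u);
        if (active)
            dest[k] = a[k] + b[k];
    }
}
```

Under these assumptions, an add issued in condmem mode writes every element regardless of the mask, because only the memory half of the statement is conditional.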


Accumulated Context Count:

    vmcount[_v,_s](dreg)    Set dreg to count of 1's in vector mask

Note: _s version is for scalar ops, _v for vector ops. (_v is the default.)


VU Pair Data Exchange:

    exchange, noexchange    Arithmetic results exchanged by pairs of
                            VUs on the same chip

Population Count:

    epc{s,v}(type, src, dest)    Counts 1 bits in src, stores total in dest
    type = {u, du}


5 VU Accessor Instructions

These accessor instructions are always used as single statements, execute on the 
node microprocessor (the SPARC), and generally move data between the SPARC 
and the VU, or affect values stored in SPARC registers. 

Data Register Read/Write Operators: These move data between SPARC Registers
and VU Data Registers:

    dpwrt[_sync,_nosync](type, selector, sp_src, dreg)
    dprd[_sync,_nosync](type, selector, dreg, sp_dest)
    type = {u, du, i, di, f, df}
    sync/nosync = whether to sync VU pipeline (default is sync)


Control Register Read/Write Operators: These move values between SPARC
Registers and VU Control Registers:

    dpset[_supervisor](type, selector, sp_src, creg)
    dpget[_supervisor](type, selector, creg, sp_dest)
    type = {u, du, i, di, f, df}
    supervisor = get/set in supervisor region


Parallel Memory Load/Store Operators: These move values between SPARC
registers and VU parallel memory:

    dpld(type, address, sp_dest)
    dpst(type, sp_src, address)
    type = {u, du, i, di, f, df}


Memory Space/Bank Conversions: These operators modify the memory
address in the src register to point to a different space/bank of VU memory, and
store the modified address in dest.

    dpchgsp(src, dest)              Toggle between data/instruction spaces
    dpchgbk(src, selector, dest)    Change referenced VU region


VU Pipeline Sync: This operator prevents the preceding and following
CDPEAC statements from overlapping in the VU pipeline:

    dpsync()


CDPEAC Function Setup/Cleanup:

    dpsetup()      Initializes the VU registers for use with CDPEAC code;
                   must appear at start of block of CDPEAC code.
    dpcleanup()    Restores state of VU registers required for CM Run-Time
                   System code. Must appear at end of a block of CDPEAC
                   code that can be called by CMRTS.


6 VU Special Instructions 

These control operations are always used as single statements, and typically
perform some useful operation on VU or SPARC registers and/or memory locations.

VU Internal Register Modifiers: These operations expand into CDPEAC
instructions with special modifier flags that set the values of one or more of the
following VU internal registers:

    dp_vector_mask_mode    Default vector mask mode
    dp_stride_memory       Default memory stride
    dp_stride_rs1          Default src1 register stride
    dp_vector_length       Default vector length

    set_vmmode(vmmode)             Sets dp_vector_mask_mode to vmmode
    set_mem_stride(stride)         Sets dp_stride_memory to stride
    set_rs1_stride(rs1_stride)     Sets dp_stride_rs1 to rs1_stride
    set_vector_length(vlen)        Sets dp_vector_length to vlen

    set_vector_length_and_vmmode(vlen, vmmode)
    set_vector_length_and_rs1_stride(vlen, rs1_stride)
    set_vector_length_and_rs1_stride_and_vmmode(vlen, rs1_stride, vmmode)

Vector Mask Load/Store: These operators move the value of the vector mask
register to or from the specified VU data register (dreg).

    ldvm(dreg)
    stvm(dreg)


7 Special Notes and Restrictions

Register Stride Restrictions: 

When you apply a stride of 0 to the src1 argument of an arithmetic operation
(for example, dreg_u(R0, 0)), the src1 register must be one of the scalar
registers S0 through S15 (or S30 for double precision).

Src2 Operand Restrictions:

The src2 operand of an arithmetic instruction has the following restrictions:

■ For vector operations, src2 cannot be any of R0 through R7, by any
name (S0, V0, etc.).

■ In scalar operations, src2 cannot be any of Rnn, where nn is any
multiple of 16 (for single-precision) or 32 (for double-precision).

(For the Curious: This restriction is imposed by the way CDPEAC operations
are represented internally.)
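
The two src2 rules can be expressed as a simple check on the register number. The sketch below is plain C for illustration; the function name is hypothetical, and it tests only the two rules listed above, not the separate rule that scalar operations accept only scalar registers.

```c
#include <assert.h>

/* 1 if data register Rr may be used as src2 under the two rules above. */
int src2_reg_allowed(int r, int is_vector_op, int is_double)
{
    if (r < 0 || r > 127)
        return 0;                             /* not a VU data register */
    if (is_vector_op)
        return r > 7;                         /* R0 through R7 are excluded */
    return r % (is_double ? 32 : 16) != 0;    /* multiples of 16 (or 32) are excluded */
}
```

Since 0 is a multiple of both 16 and 32, this reading of the rule also rejects R0 as a scalar src2 operand.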

Triadic Operator Restrictions:

When a triadic arithmetic operation and a memory operation are joined, the 
src2 operand of the arithmetic operation must be identical to the dreg operand 
of the memory operation. 

Double Precision Move Immediate: 

Double-precision move operations only use the upper 32 bits of an immediate 
source operand. Thus, operands with any non-zero bits in the lower 32 bits 
cannot be specified. 
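
Whether a given double-precision constant satisfies this restriction can be checked on the host before it is used as an immediate. The sketch below is plain C with hypothetical helper names, and it assumes the host stores doubles in the 64-bit IEEE 754 format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw 64-bit pattern of a double (assumes a 64-bit IEEE 754 double). */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* 1 if a 64-bit immediate is usable by a double-precision move:
   its lower 32 bits must all be zero. */
int dmove_immediate_ok(uint64_t imm)
{
    return (imm & 0xFFFFFFFFull) == 0;
}
```

For example, 1.0 and 2.5 pass this check, while 0.1 does not, because its mantissa extends into the lower 32 bits.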